A Bayes method for a Bathtub Failure Rate via two S-pathi 



Man-Wai Hc^ 



National University of Singapore 
(February 1, 2008) 

Abstract 

A class of semi-parametric hazard/failure rates with a bathtub shape is of interest. It does 
not only provide a great deal of flexibility over existing parametric methods in the modeling 
aspect but also results in a closed and tractable Bayes estimator for the bathtub-shaped 
failure rate (BFR) . Such an estimator is derived to be a finite sum over two S-paths due to an 
explicit posterior analysis in terms of two (conditionally independent) S-paths. These, newly 
discovered, explicit results can be proved to be a Rao-Blackwellization of counterpart results 
in terms of partitions that are readily available by a specialization of James (2005) 's work. We 
develop both iterative and non-iterative computational procedures based on existing efficient 
Monte Carlo methods for sampling one single S-path. Numerical simulations are given to 
demonstrate the practicality and the effectiveness of our methodology. Last but not least, 
two applications of the proposed method are discussed, of which one is about a Bayesian test 
for failure rates and the other is related to modeling with covariates. 

1 Introduction 



In reliability theory and survival analysis it is often important to understand a hazard rate (or 
failure rate) as it is interpreted as the propensity of failure of an item or death of a human being 
in the instant future given its survival until time t. There are a variety of shapes for the function, 
for example, constant, non-increasing, or non-decreasing, of which each corresponds to a different 
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life distribution. In particular, a class of life distributions which corresponds to a bathtub-shaped 
failure rate (BFR) has received considerable attention as most electronic, eletromechanical, and 
mechanical products and human beings are subject to a high risk for failures/deaths initially 
in an "infant mortality" phase, then to a lower and constant risk in the so-called "useful life" 
period and finally to an increasing risk with time during the so-called "wearout" phase. Many 
parametric families of distributions for BFRs have been proposed over the last few decades. Most 
of which typically involving three or more parameters are based on mixtures or generalizations 
of some common probability distributions, such as exponential, gamma, Weibull and Pareto 
distributions; see Rajarshi and Rajarshi (1988) and Lai, Xie, and Murthy (2001, Section 4) for 
an extensive and collective review. For discussion of parametric models for other typical hazard 
functions, see Kalbfleisch and Prentice (1980) and Lawless (1982). Also see Singpurwalla (2006) 
for a comprehensive discussion on reliability and risk from a Bayesian perspective. 

One of the contributions of the present paper is a closed and tractable nonparametric esti- 
mator of BFRs that serve as a viable estimator of any BFR and, hence, an alternative to most 
existing parametric inferences which suffer from intractability problems [Lawless (1982), Page 
255] and often resort to extensive iterative procedure [Haupt and Schabe (1997)]. The litera- 
ture on nonparametric estimation of BFRs is rather limited though there are some available 
testing procedures involving BFRs (see, for example, Bergman (1979), Aarset (1985) and Vau- 
rio (1999)). Amman (1984) (see also Laud, Damien and Walker (2006)) studied a ?7-shaped 
process by combining two random processes, of which one is the increasing random hazard rates 
based on extended gamma processes firstly considered by Dykstra and Laud (1981) and the other 
one is the decreasing counterpart defined analogously. However, the combined process does not 
necessarily generate BFRs. Reboul (2005) introduced a data-driven nonparametric estimator of 
BFRs which, though is not in a closed form, can be computed by applying the "Pool Adja- 
cent Violators Algorithm" (see Barlow, Bartholomew, Bremner, and Brunk (1972)). References 
on nonparametric inference of any of hazard, survivor, or cumulative hazard functions in sur- 
vival analysis include, for instance, Kaplan and Meier (1958), Watson and Leadbetter (1964a,b), 
Nelson (1969), Doksum (1974), Susarla and Van Ryzin (1976), Aalen (1978), Ferguson and Pha- 
dia (1979), Tanner and Wong (1983), Yandell (1983), Lo and Weng (1989), Hjort (1990), Wolpert 
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and Ickstadt (1998) and James (2005), among others; see Ghosh and Ramamoorthi (2003) for a 
review of works related to Bayesian nonpar ametrics, and see also Sinha and Dey (1997) for an 
extensive survey on semi-parametric modeling of survival data with presence of covariates. 

In line with James (2005) who studied random hazard rates with general shapes expressible 
as X{x\fi) = J K (x , u) fi{du) , wherein K(x,u) is a known positive measurable kernel on a Polish 
space X X U and /i is a completely random measure [Kingman (1967, 1993)] on U (see Lo and 
Weng (1989) for the case when fi is an extended/weighted gamma random measure), the present 
paper considers a semi-parametric family of hazard rates onTC = (0, oo) defined by, for t,0 € Ti, 



where I{A) is the indicator function of a set A and /i is a completely random measure on 
TZ = {—oo, oo). Argument of Brunner (1992) in constructing unimodal densities on the real line 
with mode 9 based on the mixture representation of a monotone failure rate (MFR) considered 
by Lo and Weng (1989) applies and justifies that ([T]) gives an BFR on Ti. with a minimum point, 
or a change point called by Mitra and Basu (1995), at 6 £ 7i. Posterior consistency of these BFRs 
can be established following Dragichi and Ramamoorthi (2003) who showed the corresponding 
result for the class of MFRs discussed in Ho (2006a), a subclass of ([T]) when 6 = or = oo. 
Exploiting the fine structure of an indicator kernel. Ho (2006a) improves the readily available 
explicit posterior analysis in terms of partitions in James (2005, Section 4) by giving a tractable 
and less complex (see Brunner and Lo (1989)) characterization in terms of one S-path for such 
MFRs, and shows that an efficiently designed algorithm for sampling an S-path, called the 
accelerated path (AP) sampler, results in less variable Bayes estimates of the hazard compared 
to a partition-based algorithm introduced by James (2005) via numerical simulations. In this 
work, we show that all BFRs defined in ([1]) possess nice and special structures that naturally 
arise in relation to two conditionally independent S-paths given 9 in Section [21 rather than 
one in the case of MFRs; for an BFR there are two (possibly different) non-decreasing curves 
away from the change point 9 in either direction, compared with only one such curve to the 
right of the origin for a non-decreasing hazard rate. In particular, an explicit characterization 
depending on two S-paths possessed by all such BFRs, which are unprecedentedly available, 
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generalizes the corresponding characterization of MFRs discussed in Ho (2006a) that depends 
on only one path, and, more importantly, yields a tractable Bayes estimator of BFRs as a finite 
sum over two S-paths. Understanding these novel characterization and estimator for BFRs is 
of statistical importance; they can be shown to be a Rao-Blackwellization of the partition- 
based counterparts, suggesting that more parsimonious methods for inference, compared with 
partition-based methods introduced in James (2005), would be available if one could efficiently 
sample the two paths in this context. To approximate posterior quantities for models in ([1]), 
Section [3] proposes an iterative Monte Carlo procedure based on the AP sampler. Furthermore, 
extensions of a sequential importance sampling (SIS) [Kong, Liu, and Wong (1994) and Liu and 
Chen (1998)] scheme for sampling one path at a time are introduced. Numerical results of the 
method are given in Section H] to demonstrate its practicality and effectiveness. Two applications 
of the methodology are given in the last two sections in which the proposed algorithms can 
be applied to approximate the posterior quantities of interest. A test of an MFR versus an 
BFR based on models in dH) is illustrated in Section O Section [H] shows that a two S-path 
characterization also exists in modeling with covariates by a proportional hazards model. 

2 Posterior analysis via two S-paths 

A class of random hazard rates with a bathtub shape on the half line Ti, defined by ([1]), is of 
interest. The law of /x is uniquely characterized by the Laplace functional 



where g is a non-negative function on TZ and p{dx\u)rj{du) is called the Levy measure of Also, 
H can be represented in a distributional sense as 



where M{dx, du) is a Poisson random measure, taking on points (x, u) in TC x TZ, with mean 
intensity measure 





E[A^((ix, dn)] = p{dx\u)ri{du) 



(3) 



such that 



min(x, l)p{dx\u)ri{du) < oo for any bounded set B € TZ. 
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Suppose we collect independent failure times T = (Ti, . . . , T/v) from items with a common 
continuous life distribution which corresponds to an BFR with change point at 9, specified by ([1]), 
until time r, so that < Ti < • • • < Tm < t denote m completely observed failure times, and 
Tm+i = • • • = Tjv = T are Nc = N—m number of right-censored times. Assuming a multiplicative 
intensity model discussed in Aalen (1975, 1978), the likelihood of the data T is proportional to 



g-/^(9jv,e) 



m « 

n / -e <Ui<0)+I{0 <u,<T,- 9)Mdui), (4) 



where 

N 



9n, 



[u 



.i=l 



[I{t - 9 < u < 0) +1(0 < u < t - 9)]dt 



is a piecewise linear function of n, and /^(^(^ = j^g^ Q{u)^{du) = /J" > t) X{t\fi,9)dt 

with X^iLi — called the total time on test (TTT) transform [Barlow, Bartholomew, Brem- 
ner, and Brunk (1972)]. Define fj^g{x,u) = g{u)x for any {x,u) € {TC,7l) and assume that 

Ke{e~^'^'Op\u) = / x^e~^^.9(")^p(dx|u) < oo, (5) 



for any positive integer I < m and a fixed u G TZ. 

The posterior distribution of the pair (^u, 9) in ([1]) given T with respect to any prior iT{d9) 
for 9 ^Ti can always be determined by the double expectation formula, 

E[/i(/i,^)|T] =E{E[/i(^,^)|0,T][T} = / / h{^l,9)V{d^l\9,T)V{d9\T), (6) 

Jn JM 

where h is any nonnegative or integrable function, M. is the space of measures over TZ, and, 
'P{dfj,\9,T) and 'P{d9\T) denote the conditional distribution of fi given (0,T) and the posterior 
distribution of 9 given T, respectively. 

Let us first look at V{dii\9,T) and then discuss V{d9\T) later on. Suppose < ^ < r, we 
can always assume that 

{T,-9,...,Tm-9) = Z'uY' = iZlzl,...,Zi_JUiY,',Yl...X), (7) 

where -9 ^ Zl < Zf < Z| < •■■ < < Zi_^^, ^ and ^ Y^^, < Y^ < Y^i < 

■■■<Yf < Y^ = T — 9 are referred to as negative and positive observations in the sequel. The 
relationship between these notation and the data T is illustrated in Figure [H graphed together 
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with the TTT transform. It is worthy of note that once a failure time Tj, i = l,...,m, is 
completely observed and compared with the given 9, the mixture hazard rates can be simplified 
as in one of two mutually exclusive situations specified by 

fi{z!l <u,<o)f,{dui), T^-e = z1<o, 



J 1(0 < < Y^)f,{dui), T,-e = Y^> 0, 



(8) 



for j = 1, . . . ,m—n and k = 1, . . . , n. This also implies that the missing variable Ui corresponding 
to Tj in is always greater (resp. smaller) than if Tj > (resp. <)9. This nice similification 
proves to be crucial in leading to the tractable path structure of BFRs in ([T|). 
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Figure 1: Illustration of the TTT transform and the relationship ([7]) be- 
tween T and (Y^Z^e). 



Define an integer-valued vector S = (^o, 5i, . . . , Sm-i-, Sm) [Lo and Weng (1989) and Brunner 
and Lo (1989)], referred to as an S-path (of m-|- 1 coordinates), which satisfies Sq = 0, Sm = m 
and Sj < min(j, 5j+i), j = 1, . . . , m — 1. An S-path is a combinatorial reduction of a partition 
in the sense that an S-path of m -|- 1 coordinates is said to correspond to one or many partitions 
p = {Ci, . . . , C„(p)} of the integers {1, . . . , m}, provided that (i) indices of the maximal elements 
of the n{p) cells C^'s in p coincide with locations j at which Sj > Sj^i, and (ii) number of 
indices of cell Ck for all /c = 1, . . . , n(p) with a maximal index j , j = 1, . . . , m, is identical to 
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Sj — Sj-i. Given a path S of m-|- 1 coordinates, let Cs denote the collection of all partitions that 
correspond to S. Then, the total number of partitions in Cs is given by [Brunner and Lo (1989)] 

icsi-Ei- n (^7_t'-')- 

peCs {i*|s} ^ ^ ^ 

where, conditioning on a path S of m + 1 coordinates, n{j*|s} stands for YYJLi-s >s i' Simi- 
larly, X]{j*|s} ^il^ stand for X]jLi-Sj>Sj_i- ^ee Ho (2002) for more discussion of the relationship 
between p and S. 

Theorem 2.1. Suppose that the likelihood of the data T is given by ^ and that ^ is a 
completely random measure characterized by the Laplace functional Then, the posterior 
distribution of fj, given 6 and T can be described as a mixture as follows: 

(i) Given {6, T), there are two paths S~ = (0, 5^, . . . , S^_^_^,m—n) and S''" = (0, Sf, . . . , S^_^,n), 
independently distributed as 

VF-(S-|^?,T)occ/)-(S-,T) = |Cs-| n j\^m-ie~^''''p\y)v{dy) (10) 

and 



T^+(S+|^,T)oc</.+ (S+,T) = |Cs+| n / 



t^^+{e-f^.op\y)r]{dy), (11) 



{i*|s+}' 

where jCg- | and |Cs+ 1 are defined in mj = Sj — Sj_-^,j = 1, ... ,7X1 — n and = 
•Sj*" — Sj~__i,j = 1, . . . , n. 

(ii) Given (S^,S^,^,T), there exist X^|j*|s-} 1 and ^{j*|s+} 1 independent pairs of {yJ,Qj) 
and {yj',Qj), denoted by (y",Q") = {{yJ,Qj) : mJ > 0,j = l,...,m-n} and 
(y^, Q^) = {{Uj^, Qj^) '■ rn^ > 0, j = 1, . . . , n}, respectively. They are distributed as 

r/,((iy7|S-,0,T) oc 1{Z'^ < yj < Q)K^^^^{e-f^:^ p\y-)r^{dy-), (12) 
Pr{Q7 G'i9|y7,S-,0,T} a q^^ e-^^^''^^^^^ p{dq\yj), (13) 

and 



r?,(dy+|S+,0,T) (X ^ < y^ <Yf)K^+{e'f^^o p\y+)^^dy+), (14) 
Pr{Q+ G(i(?|y+,S+,0,T} a q'^U-^^-'^yt^i p{dq\y+), (15) 

respectively, with existences guaranteed by ([5]). 
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(iii) Given (y~, Q^, S^, y+, Q^, S+, ^, T), fi has a distribution identical to that of the random 
measure 

{j*\S-} {j*\S+} 
where l^-gj^ g is a completely random measure with Levy measure e~^'^'^^'^^^ p{dx\u)r]{du). 

Proof. When 9 is given, Theorem 4.1 in James (2005) specializes and yields that the law of fj,\9, T 
can be described as the random measure g + Yl^^'^ Jj^Vi mixed over by the law of J, v, p|^, T, 
where J = (Ji, . . . , J„(p)), v = (fi, . . . , u„(p)) denotes the unique values of (ui, . . . , u^), and ^ 
is a completely random measure characterized by Levy measure e~^N-B^^^^ p{dx\u)rj{du) with law 
denoted by V[dpg^ g). That is, it can be determined by the joint distribution of jig^^ g,J,v, p\9, T, 
which is proportional to 'Pidpg^ g) multiplies 

n(p) 

W Ji'^'e-^^^o^'^'^-^'pidJilvi) II [lin -0<Vi<0)+I{0<v^<Tk- eMdvi). (16) 

i=l k€Ci 

Rewriting T as and as defined in ([7]) and simplifying the sums of two indicators due 
to dS]) reveal that the m — n negative observations can "cluster" only with one another but 
not with any of the positive observations Y^, or vice versa. Hence, it is eligible to "split" p 
into two non-overlapping partitions and p^. Write p = p^ U p^. Without loss of generality, 
let p^ = {Ci, . . . , C„(p-)} and p+ = {Cn(p-)+i, • • • , Cn(p)} denote the partition of the m — n 
negative observations Z^ and that of the remaining n positive observations Y^ in relation to 
negative and positive unique values in v, respectively. Hence, the law of J, v, p|^, T, proportional 
to (fT6]) . becomes 



n(p ) 

n 

i=l 



j.e«e"S'^>«(''')-^"p(dJi|vi)I(maxZf < Vi < 0)i]idvi 



n(p) 

X n 

i=n(p )+l 



Ji^'e-3^'0^'"^'^y{dJi\vi)I{0 <Vi< mmY^)rj{dvi) 



(17) 



Due to its dependence on the maximal index but not the remaining indices of each cell in both 
p~ and p+, this can be represented in terms of the intrinsic characteristics of two paths S~ and 
S+ of respectively m — n + 1 and n + l coordinates via relabeling of {(fi, Ji), . . . , (f,i(p-), Jn{p-))} 
and {(w„(p-)+i, J„(p-)+i ),..., (t;„(p),J„(p))} respectively as (y",Q~) and (y+,Q+) according 
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to p € Cs- and p+ S Cs+, together with equahties, 



n < < 0) = n I(^max,,c.fc 

i=l 1=1 



<v,<0)= J] 1(^1 < yr < 0) 
{i*|s-} 



and 



n(p) n(p) 

H I{0<v,<nnn Y,')= J] 1(0 < t;, < Y^,^^^^ ,) = J] m<yt<Yf). 

i=n(p-)+l ' i=n(p-)+l ' {j*lS+} 



That is, (jl6p or (jl7p can be equivalently expressed as 



{i*|s-} 



{i*|s+} 



(18) 



In other words, the law of g, J, v, p|0, T only depends on p through and S^. The above 
equality of (jl6p and (jlSp together with the following relation of equivalence in distribution 
between the two random measures, 
n(p) 



i=l 



^TV, (19) 



{j*|S-} {i*|S+} 

imply that the law of fi\9, T can be described as the random measure fi* at the right-hand side 
above mixed over by the law of Q^, y^, S^, Q+, y+, S+|0, T, which is proportional to 



[Cs-I n {iQjr^e-^^''>^y7^'^JpidQj\yj)I{Z^ < yj < OMdyj)] 
{i*|s-} 



X 



ics+[ n {(Q 

{i*|s+} 



o^yt^'^tpidQ+ly+nO < y+ < YfUdyp} 



(20) 



and obtained by summing over all p~ G Cg- and p"*" G Cs+ in (jlSp . Now, the laws given 
by (jlOmsp . together with the conditional independence relationships among them, follow from 
Bayes' theorem and multiplication rule, completing the proof. □ 

Corollary 2.1. The posterior mean of the BFRs in ([1]) given 6 and T is given by, for t € [0, r], 



E[A(t|M, e)\9, T] = E E ' ^' ^(^^ ' ^^1^' 

s- s+ 

where represents summing over all paths S of the same number of coordinates. 



(21) 



VF(S-,S+|0,T) = W-{S-\e,T) X T^+(S+|0,T) 



10 



Ho 



is the conditional distribution of (S , S+) given 6 and T, and 
aA(i|S-,S+,^,T) 



f \,ie~f^p\yMdy)+ ^ A+.(t|S+) 

•''0 



I{t < 9) 
l{t > 6), 



{i*|s+} 



max(t-e,ZJ) 



fO t-O 

wherein Ag .(t|S") = / _ i^„r P\y)v{dy) / J^^ i^raji^^'^'^'" P\y)^{dy), for j = 1, . . . , 

fmm{t-e,y») ^ /-y" 

K^++i(e~-^'^-V|y)??(c^y)/ / ' i^„,+ {e'^''^'p\y)ri{dy), for 



m — n, and A^^(t|S+) 
i = l,...,n. 







Proof. If u = (lii, . . . , Um), the posterior mean of p given (u, 9, T) follows from Theorem 12.11 as 

E[i2*{du)\u, 9, T] = E[fi*{du)\y-,S-,y+, S+,e, T] 

= Ki{e'f^^op\uHdu)+ E[Q-\y-]6y-idu)+ ^ i?[Q+|y+]5^+ (dn), 

{i*|s-} ' {i*|s+} 



where jy^ 



(e ^'^•Vbj )/'«m7(e ^'^■"Pb,- ) and 



-^[Q^bi'^] = i^rn++i(^ ^'^''^ P\yt) / ^m+^^ ^^''^ p\yt)- Hence, the posterior mean of X{t\p,9) given 



9 and T is 



V y I / - 6l<n<0) + I(0<u<t - 6i)]Ki(e--^^^.V|w)??(rfii) 
+ [ Kt-0<yj <0)E[Qj\yT]n-{dyj\S-,9,T) 

+ E ^i(o<y+<t-WQ;iy,+]^+(^y,+|s+,e,T)W(s-,s+|0,T) 



{i*|s+} 

and the result follows by comparing between t and 9. □ 

Remark 2.1. When = or = cxd, Theorem 12 . 1 1 and Corollarv 12 . 1 1 reduce to a characterization 
of the posterior distribution and the posterior mean of the class of MFRs discussed in Ho (2006a) 
via one single S-path. 

With the following posterior consistency result, which is an analogue of Theorem 4 in Dragichi 
and Ramamoorthi (2003) in this context, the consistency of the above Bayes estimator of BFRs 
with a change point 9 can be established via the same argument used in Corollary 1 of Barron, 
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Schervish and Wasserman (1999). Suppose Aq is the true BFR defined in ([T]), with a correspond- 
ing density function /q. 

Theorem 2.2. Suppose 6 is known and that max(lim £'[A(t|/x, ^)], lim E[X{t\iJ,,6)]) < cxd in ([1]). 
If Ao is bounded with Ao(6'_ 6*), Ao(0+|/i, ^) > 0, weak consistency holds at /q. 

Proof. The proof follows from that of Theorem 4 in Dragichi and Ramamoorthi (2003) by 
splitting the argument based on an increasing hazard rate on (0, oo) into two parallel situations 
with respect to 9, as there are two increasing hazard rates away from 9 of which one is increasing 
from to CO and the other one is increasing from to 0. □ 

Remark 2.7 in Ho (2006c) explains that the above characterization of the posterior distribu- 
tion and the estimator (j21|) for models in ([1]) based on two S-paths result in significant improve- 
ments in terms of complexity, compared with the counterparts in terms of partitions from the 
general result of James (2005). More importantly, dividing (jlSp . which is the joint distribution 
of (J,v,p) given 9 and T, by ([20|) . the joint distribution of (Q~, y^, S^, Q+, y+, S^") given 9 
and T, yields the following analogue of Corollary 2.4 in Ho (2006c) which states that given 
(S^, S^, ^, T), p is uniformly distributed over all partitions that can be split into p~ and of 
which correspond to the respective paths S~ and S"*". Consequently, the results in Theorem 12.11 
and Corollary 12. H which follow from the same argument as in Ishwaran and James (2003) or 
Ho (2006a) to be always less variable than their counterparts in terms of p, are worthy of study 
due to the posterior consistency result. 

Corollary 2.2. Consider models in ([I]). Suppose S-,S+|6',T ~ W"(S-, S+|6', T). Then, there 
exists a conditional distribution 



where ICg-j and |Cs+| are defined in Q. 

Theorem 2.3. Suppose the likelihood of the data T given (fi, 9) is proportional to (j3]). Assume 
that /i is a completely random measure with Levy measure ([3]) and the prior of 9 is TT{d9). The 
posterior distribution of 9 is characterized by, for any Borel set B ^Ti, 



7r(p|S 



S+,^,T) 



1 



P = P Up+,p gCs-,p^gCs+ 



|Cs-l|Cs 




s- s+ 



(22) 
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where 



7r(S-,S+,(i0|T) (X /:^(5^,e|p,r?)</'e (S-,T)</<+(S+,T)7r(d0) 



(23) 



defines a joint distribution of (S ,S^,6) given T, with a normalizational constant 



C^igNMP, V) Es+ Es- '/'^ T) 0^(8+, T) 7r{di3) and /:^(>, r/), (/>e"(S-, T) and 0^(S+, T) 



defined in (l2|), (fTOj) and ([TT]) . respectively. 

Proof. Applying Proposition 2.1 in James (2005) and following the same argument as in proving 
Theorem 12.11 vield a joint distribution of (J,v,p,^) given T, which is proportional to the ex- 
pression (fT6]) multiplies C^{giy,e\p,'n)'^{d9). Integrating (J, v), which is equivalent to integrating 
(Q^, y^, Q"*", y+) in (fTH]) . gives a joint distribution of {S^,S^,9) given T as in ((23]) . Result 
follows from further marginalization of (S^, S^). □ 

When 6 is not known, posterior analysis of models in ([T]) follows from ([6]) with V{d6\T) 
defined above. For instance, the posterior mean of hazard rates in ([T]) given T is given by 



where aA(i|S , S^, 9, T) is defined in Corollary I2.1[ 

3 Monte Carlo procedures 

This section introduces Monte Carlo procedures for evaluating/approximating posterior quan- 
tities of models in ([T|), like (I2ip . (I22p and (I24[) . which are expressible as finite sums over two 
S-paths, based on sampling the triplets (S~,S~'',^) in light of the data T. For brevity, condi- 
tioning statements on the data T will be suppressed throughout in this section as all sampling 
procedures are designed with respect to distributions conditioning on T. Firstly, when is given, 
both iterative and non-iterative procedures for sampling the paths (S^,S^) will be discussed. 
Then, a sequential importance sampling (SIS) scheme for drawing the triplets from the poste- 
rior distribution 7r(S^, S^,d9\T) in ()23p is proposed. Conditional independence between and 
given 9 and T stated in statement (i) of Theorem 12.11 the nice structure of the posterior 
distribution for models in ([T]), plays a crucial role in constructing all the algorithms that follow. 




s 



s+ 



(24) 
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3.1 When 9 is known 
3.1.1 A Gibbs sampler 

Define a generalization of the accelerated path (AP) sampler introduced in Ho (2002) (see also 
Ho (2006a,b)), which is an efficient MCMC algorithm for sampling one single S-path at a time 
in the context of Bayes estimation of monotone hazard rates and monotone densities, as follows. 

Algorithm 3.1 (The AP sampler). A Markov chain of S-paths of n + 1 coordinates with a 
unique stationary distribution, 

7r(S)cx</>(S) = |Cs| n ^^""'H^j)^ (25) 

where ip^^^i^Xj) is a finite real- valued function depending on mj and Xj only, and Xi, . . . ,Xn 
is a decreasing/increasing sequence in TZ, can be defined by a transition cycle of n — 1 steps: 

(I) At step r, suppose S* = {0, Si, . . . , Sr-i,c, . . . ,c, Sq, . . . , Sn-i,n), where Sr-i < c < 
min(r, Sq — 1) and q > r denotes the next location at which = Sq — Sq-i > 0. The 
chain moves from S* to 8**^ ^ = (0, 5*1, . . . , Sr-i, k, . . . ,k, Sq, . . . , n) with conditional 
probability proportional to (j){S**^ ^) for k = Sr-i, Sr-i + 1, Sr-i + 2, . . . , min(r, Sq — 1). 

(II) Repeat step (I) for r = 1, 2, . . . , n — 1 to complete a cycle. 

Starting with an arbitrary path S^g), and repeating M cycles according to the above scheme, 
give a Markov chain S(o), . . . , S{j\/) with a unique stationary distribution '7r(S). We remark 
that the sequence of determination of coordinates Si in the AP sampler does not have much 
effect on its effectiveness or efficiency. 

As a consequence of conditional independence between S~ and S+ given 9 and T, an iterative 
scheme, dubbed as accelerated paths (APs) sampler, for sampling a pair of (S^,S+) from the 
posterior distribution VF(S-, S+161, T) = W-{S-\e,T) x W+{S+\9,T) in Corollary O can be 
defined naturally by two independent implementations of the AP sampler, or, by cycling through 
the following two steps in a cycle: 

(Ml) Determine S~ by applying Algorithm 13.11 with n, (f>{S), Xi, . . . , A„ and ^/^*^™'j)(Aj) replaced 
by m - n, </>g"(S", T), Zf,..., and f^g k^- {e-f^^e p\y)r]{dy), respectively. 
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(M2) Determine S+ by applying Algorithm 13.11 with 0(8), Xi, . . . ,X„, and replaced 

ye 

by (f>i{S+,T), Y^^, ...,Y^ and ^ K^+ie'^^.o p\y)v{dy), respectively. 

A Markov chain (S^^, S^^, (S^-j, S^^, . . . , (S^^^^ ^(a-/)) '^ith a unique stationary distribu- 
tion W{S^ , S^\6, T) can be obtained by starting with an arbitrary pair of paths S^^ and Sj^, 
and repeating M cycles of steps (Ml) and (M2). Then, expectations of any functional h{S^ , S^) 
with respect to the probability distribution W{S^ , S^\6, T) can be approximated by the ergodic 
average [Meyn and Tweedie (1993)] 

1 ^ 

i=l 

For instance, the posterior mean E[A(t|/i, ^)|^, T] in (j2ip can be approximated by 

1 *^ 

<'^W = ME«A(i|S^),S+,^,T). (26) 

i=l 

3.1.2 A sequential importance sampling method 

Due to the same reason as for constructing the APs sampler, we propose an SIS [Kong, Liu and 
Wong (1994) and Liu and Chen (1998)] method for sampling the two paths from W{S^ , S^\9, T) 
which is designed as two independent implementations of an SIS scheme for sampling one path 
at a time, called the sequential importance path (SIP) sampler introduced in Ho (2006c). The 
SIP sampler is an SIS scheme that allows us to draw an S-path of n + 1 coordinates according 
to a probability distribution vr(S) oc (p{S) defined by (p5]) . Let Iq = and /„ = n. 

Algorithm 3.2 (The SIP sampler in Ho (2006c)). Based on a random permutation = 
{Ii, . . . , In-i} of the integers {1, 2, . . . , n — 1}, an SIS method for sampling an S-path of n + 1 
coordinates from vr(S) given in (j25p consists of recursive applications of the following SIS steps 
for r = 1, . . . , n — 1: 

A. Given Dr-i = {Iq} U {Ii, . . . , Ir} U {In}, which is the collection of all indices i whereby Si 
has been determined up to step r — 1, let p = max{/j G Dj—i : Ij < I,} and q = minjlj € 
Dr-i : Ij > Ir}- Determine Sj^ = k, for k = Sp,Sp + 1, . . . ,min(/^,5q), according to a 
probability distribution 

ar{k\{Sh : h € Dr-i}) oc (/>(Sj^, ;,), 
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where S^^ = (0, S**, . . . , 5|^_p 5|^^^, . . . , S*_i,n) is a path of n + 1 coordinates such 

that Sj^ = k and for i = 1, . . . , Ir — 1, Ir + ^, ■ ■ ■ ,n — l, S* = Sj^ if z = //i G -Dr-i ; otherwise, 
c* c* 

B. Compute ar{k\{Sh ■ h G Dr-i}), which equals (j){S*j^ f,) multiphed by the appropriate 
constant of proportionahty, for the chosen value k of Sj^. 

After step n — 1, a random path S = (0, Si, S2, • • • , Sn~i,n) distributed as 

n-l 

a„_i(S) = H ar{SiJ{Sh ■■ h G Dr-i}) (27) 

can be obtained. The importance sampling weight of this realized path S is given by f n_i(S) = 
(/)(S)/(T„_i(S). Or, S is said to be properly weighted by a weighting function t;„_i(S) with respect 
to the distribution 7r(S) in ([25]) [Liu and Chen (1998)]. 

Algorithm 3.3 (Sequential importance paths (SIPs) sampler). For a fixed value of 6, an SIS 
method for sampling a random pair of (S^, S+) from the posterior distribution W{S~ , S^\9, T) 
consists of the following three steps: 

(51) Obtain and based on 9 according to ([7]). Get random permutations H^-n-i and 
En-i of the integers {1, . . . , m — n — 1} and {1, . . . , n — 1}, respectively. 

(52) Determine S~ of m — n + 1 coordinates by applying Algorithm 13.21 based on Hm-n-i 
with n, 0(8), Xi, . . . , X„ and ^/>(™j)(Xj) replaced by m — n, 0^(8", T), Zf,..., Z^_^ and 
/z« '^m" respectively. Obtain o"m-n-i(8^ |0) according to (p7|) . 



(S3) Determine 8"*" of n + 1 coordinates by applying Algorithm 13.21 based on with (?!)(8), 

and replaced by </.+ (8+,T), Yf,...,Y^ and 

yS 

Jq ^ K +(e~f^'^p\il)r]{dy), respectively. Obtain (T„_i(8+|6') according to ([TT]). 

i 

The pair (8^, 8^) is said to be properly weighted by a weighting function 

0,(8-,T)0+(8+,T) 



Wm-2,e(8 ,8+) 



a„_„_i(8-|0)cTn-i(8+|0)' 
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wherein m — 2 in the subscript representing the total number of SIS steps, with respect to 
H^(S~, S+|0, T). Note that steps (S2) and (S3) above are interchangeable as the two paths are 
conditionally independent given 9. Replicating the above algorithm M times gives M iid pairs of 
draws, (S^^, S^^, . . . , (S^^j^, S^^^), with respective importance sampling weights, 
'^m-2,e(S^-), S^^), . . . , ti;m-2,6»(S(^j^, S^^^. Then, expectations of any functional h{S^ , S+) with 
respect to the probability distribution W{S^ ,S^\0,T) can be approximated by 

M _ E»=l K%) , ) t^m-2.e (S(^) , S+ ) 

For example, the posterior mean E[A(t|//, T] in ([2T|) can be approximated by 

M _ Efii«A(t|S(:),S+ ,g,T)^^„2.e(S(:),S+ ) 
Z^i=i^^m-2,eia(j),»(.-)j 

3.2 When 9 is unknown — SIPs(^^) sampler 

When 9 £ Ti. is unknown, we can design an SIS scheme, dubbed as SIPs{9) sampler, which is 
basically as a slight extension of the SIPs sampler (Algorithm I3.3p , for sampling the triplets 
from Tr{S^ ,d9\T) in (|23p : inserting the following step, 

(SO) Sample 9 according to a density p{9) > 0, 9 £ TZ, 

before implementing the three steps (S1-S3) in Algorithm l3.3l gives a random sample of (S~, S"*", 9) 
which is properly weighted by a weighting function 

^+ _ C,{g^,e\p, ^)0g (S-, T) 0+(S+, T) TrjO) 
u^m-i[^ ,b ,9)- ^_^(s-|^)a„_i(S+|0)p(e) 

with respect to 7r(S^, S+, (i0[T) if ^{(19) = n{9)d9. Note that the total number of positive ob- 
servations n is no longer a constant as it is in Algorithm 13.31 n, depending on 9, is fixed in 
step (SI) only after each determination of 9 in step (SO). Suppose we implement the SIPs(0) 
sampler independently for M times to get M iid draws of the triplets, (S^^ S^^, 0(i)), . . . , 
(S^^, S^j-^, 0(^f)), with respective importance sampling weights, u;m-i(S^',, Sj^, 0^]^)), . . . , 
LOm-i{^~[My^tM)'^iM))- For any function /i(S", S+, 6*), 



E 



[MS-,S+,^)|T]^ / ^^/i(S-,S+,e)7r(S-,S+,d0|T)^r?f 
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where 



Vh 



Hence, in Theorem l2.31 the posterior probabihty (j22p can be approximated by setting /i(S^ , S+, 6) 
l{e G B), that is, 



Pr(6' G B\T) = ¥.[1(9 G B)|T] 



Similarly, regarding the Bayes estimate of the BFRs in ([T]) given by ()24p , we have 



E[A(t|/i,0)|T]«r?*f(t) 



(29) 



(30) 



4 Numerical Results 



This section illustrates the methodology with numerical examples. For purpose of illustration, 
fj, is selected to be a gamma process with shape measure as a uniform density on [— 2t, 2t], that 
is, a completely random measure with Levy measure 

p{dx\u)ri{du) = x~^e~^I{x > 0) dx x -^I(-2r < u < 2t) du, 

as it results in closed and easily manageable expressions for most quantities that appear so 
far. The prior ^{dO) is chosen to be uniformly distributed on a reasonably large interval on 
7i to "deflate" the prior belief. Simulated data are generated from two bathtub-shaped life 
distributions to test the methodology. The life distributions correspond to BFRs given by 



Ai(t) 



1, 



and 



A2(t) 



,-2/3 



-2.5i 



-2.5 



-6+0.7i 



< t < 0.5, 
0.5 < t < 3, 
t > 3. 

< t < 1, 
1< t < 5, 
t > 5, 



(31) 



(32) 
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respectively. The censoring rates in the data sets governed by hazard rates (|3ip and (j32p are 
about 15% and 20% by setting termination times r = 4 and r = 8, respectively. Last but not 
least, Monte Carlo size M = 10, 000 is chosen for implementations of the proposed SIS methods 
in all results that follow. 

Our attention is to first investigate whether the iterative scheme and the SIS method work 
well when 6 is fixed. The APs sampler discussed in Section 13.1.11 and the SIPs sampler (Algo- 
rithm [3i3]) are implemented based on a fixed value of 0, wherein the APs sampler is initialized 
by paths S^^ and with coordinates = = i, for all i, to produce totally M = 1,000 
pairs of paths in the sense that samples are taken once every 5 cycles after a "burn-in" period 
of 5, 000 cycles. As there is a long interval in which the test BFRs (j3ip and (j32p attain their 
minimum value, both the algorithms are implemented with three different values of 6 in order to 
see whether there is any significant effect of different choices of 9 on the performance. For fitting 
Ai(t), 6 is fixed at 0.5, 1.75 and 3, whereas for fitting A2(t), 1, 3 and 5 are selected. In partic- 
ular, the convergence property of the approximated hazard rate estimates as the total number 
of observations N increases is studied. Figures [2] and [3] depict ergodic averages produced 
by the APs sampler with the aforementioned different values of 9 based on nested samples of 
sizes N = 500, 1,000 and 3,000 from the Hfe distribution governed by BFRs ^ and ([52]) . 
respectively. Corresponding weighted average estimates (|28p produced by the non-iterative SIPs 
sampler for approximating ()2ip are graphed in the first three rows of Figures H] and [5l 

To investigate the performance of the SIPs(0) sampler when 6 is not known, we set p{0) to 
be uniform on an interval which includes all the complete observations. Independent random 
samples of {S^,S^,6) of size M = 10,000 are resulted from implementing the sampler based 
on the same sets of nested samples of sizes N = 500, 1,000 and 3,000 according to the two 
hazard rates Ai(i) and A2(t). For the sake of a better comparison between results by the SIPs(0) 
sampler based on an unknown 9 and those by the SIPs sampler with a fixed 9, the resulting 
Bayes estimates of the BFRs (j3T]) and p2|) , given by the weighted average (j30|) , are presented 
in the last rows of Figures S] and [5l respectively. 

In summary, the graphs echo the fact that approximations for Bayes estimates of the BFRs 
in ([1]) by all the proposed algorithms tend to the "true" hazard rates, Ai(t) and A2(t), as sample 
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size increases. We remark that some other simulations we have carried out applying the APs 
and the SIPs samplers based on fixed values of 6 other than those stated above reveal that there 
is not much difference between simulation results based on different values of 0. 



N = 500; m = 435 



N = 1000; m = 845 



N = 3000; m = 2569 












Figure 2: The true bathtub-shaped hazard rate Xi{t) (solid line) given 
by (j3ip and the Bayes estimates produced by the APs sampler based on 
total number of observations, N = 500 (left column), 1,000 (middle column) 
and 3, 000 (right column), with 6 = 0.5, 1.75 and 3 (from top row to bottom 
row). 
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N = 500;m = 400 N = 1000; m = 782 N = 3000; m = 2381 
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Lj:Lj:Li 



1 1 1 1 r 



1 1 1 1 r 

02468 02468 02468 

Figure 3: The true bathtub-shaped hazard rate \2{t) (sohd hne) given 
by ()32p and the Bayes estimates produced by the APs sampler based on total 
number of observations, N = 500 (left column), 1,000 (middle column) and 
3, 000 (right column), with 9 = 1,3 and 5 (from top row to bottom row). 
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N = 500;m = 435 N = 1000; m = 845 N = 3000; m = 2569 
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012340123401234 




01234012340123 



Figure 4: The true bathtub-shaped hazard rate Ai(t) (sohd line) given 
by ([3T]) and the Bayes estimates produced by the SIS methods based on 
total number of observations, N = 500 (left column), 1000 (middle column) 
and 3000 (right column), wherein estimates in the first three rows from 
top to bottom are obtained by the SIPs sampler (Algorithm 13. 3p with 6 = 
0.5, 1.75 and 3, respectively, and those in the last row are obtained by the 
SIPs(^) sampler with an unknown 9. 
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N = 500: m = 400 



N = 1000; m = 782 



N = 3000; m = 2381 
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Figure 5: The true bathtub-shaped hazard rate X2{t) (sohd hne) given 
by ([32]) and the Bayes estimates produced by the SIS methods based on 
total number of observations, N = 500 (left column), 1000 (middle column) 
and 3000 (right column), wherein estimates in the first three rows from 
top to bottom are obtained by the SIPs sampler (Algorithm 13. Sp with 9 = 
1, 3 and 5, respectively, and those in the last row are obtained by the SIPs(0) 
sampler with an unknown 9. 
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5 A Test of an MFR Versus an BFR 



Early references devoted to testing for a constant hazard rate versus an MFR include Proschan 
and Pyke (1967), Bickel and Doksum (1969) and Gail and Gastwirth (1978a, b), among others. 
Without relying on exponentiality assumption, Gijbels and Heckman (2004) develop a testing 
procedure via normalized spacings for testing an MFR against alternatives of some local de- 
partures. For testing an MFR versus other general alternatives. Hall and Van Keilegom (2005) 
propose a calibration method related to the "increasing bandwidth" approach suggested by Sil- 
verman (1981) in the case of density estimation. Testing procedures involving BFRs can be found 
in, for example, Aarset (1985), who discussed the test statistic proposed by Bergman (1979) for 
testing a constant hazard rate against an BFR, and Vaurio (1999), who proposed a few test 
statistics for testing between an MFR and other non-monotone alternatives including BFRs. 

A Bayesian test of monotone versus bathtub-shaped hazard rates can be readily defined in 
terms of 9 based on the models in ([T]) with /u being a nuisance parameter as follows: Suppose 
we are interested in testing whether a set of observations T, defined similarly in Section [21 is 
generated according to a non-decreasing hazard rate or an BFR. Based on ([T]), it is equivalent 
to choose between two hypotheses Hq ■.9 = and Hi : ^ G (0, oo) as when 9 = 0, models in ([T]) 
correspond to a class of non-decreasing hazard rates; otherwise, they give a class of BFRs with 
a change point ^ > 0. In particular, the likelihood of the data given (/x, 9) under Hi is given 
by dH) when 9 ^ or oo, while the likelihood of the data given /j, under Hq follows from ^ with 
i9 = as 



Let VTo denote the prior probability of Hq, and then 1 — ttq denotes the prior probability of Hi; 
furthermore, suppose the mass on Hi is spread out according to a distribution Tr{d9). Suppose we 
assume that //'s under Hq and Hi are two independent, but not necessarily identical, completely 
random measures characterized by ([2]). 




m 



(33) 



i=l 



Corollary 5.1. Suppose ^ is a completely random measure characterized by ([2]). It follows from 
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Theorem 12.31 that the hkehhood of the data T given 9 is proportional to 

me{T)=£^{gM,e\P,v) x ^^0, (S-,T) x ^0+(S+,T). (34) 

s+ 

Hence, the marginal density of T is given by 

m(T)=7rox£^(5jv,o|p,??)J^</'^(S+,T) + (l-^o)x f m,(T)7r(d^). (35) 

s+ •^^ 

It implies that the posterior probability of Hq is given by 



P{Ho\T) 



m(T) 



and that of Hi is equal to 1 — P{Ho\T). Also of interest is the posterior odds of Hq to Hi, which 
is given by 

X 



1-vro f^me{T)7T{de) 

wherein ito/{1 — ttq) is the prior odds and the latter ratio is the Bayes factor for Hq versus 
Hi (see Kass and Raftery (1995) for a review of Bayes factors). 

Regarding implementation of the above Bayesian test, Algorithm 13 . 2 1 and the SIP(0) sampler 
can be applied to approximate the marginal density of T, m(T), in ()35p . and also the posterior 
probabilities of Hq and Hi. On one hand, the sum J2s+ '^oi^^^ approximated by 

1 ^ 

— ^CT^_l(S(i)), 

i=l 

if S(o), . . . , S(7v/) are independent samples obtained via implementing Algorithm 13.21 with 
0(S) = 0(|"(S,T) in (125]) and (Tm-i(S(j)) defined in (f27ll . On the other hand, the integral 
J^mg{T)iT{d9) is approximated by 

1 ^ 

— CJm_„(^j _l (S( .) ) C7„(^j _l (S J) ) p(^(i) ) , 
i=l 

if (S^p S^^, 6'(;^) ),..., (S^^p 0(7V/)) are independent samples obtained via implementing 
the SIP(6') sampler, whereby n(j) is determined in step (SI) after 6'(j) is fixed in step (SO), and 
Cm-n(i)-i(S^^ |^(j)) and cj„j.j_i(S^^ |0(j)) are obtained from steps (S2) and (S3), respectively. 
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The Cox regression model [Cox (1972)] is an important example of the multiplicative intensity 
model that can allow incorporation of covariates, together with right independent censoring, 
in survival analysis. For Bayes inference of general hazard rates with presence of covariates, 
see Kalbfleisch (1978), Ibrahim, Chen and MacEachern (1999), James (2003) and Ishwaran and 
James (2004), among others. Suppose we collect failure data until time r, which are governed 
by an underlying hazard rate on 7i associated with a p-dimensional covariate vector X G TZ^, 



\it\X,p,^i,e) = X{t\fi,e)exp{(3'^X), 



where \{t\n, 0) defined in ([T]) is an unknown baseline hazard rate of a bathtub shape and (5 € TV 
is an unknown parameter vector. The data D = ((Ti, Xi), . . . , (T^r, X^v)) summarize completely 
observed failure times Ti < ■■■ < and right-censored times Tj = r, i = m + 1, . . . , N , 
associated with covariate vector Xj, i = 1, . . . , A^, respectively. Define ^51(^5 = 9n ^ei'^)^-: 
for any (x,ii) G {TC,7l), where 



N 



Y,I{Ti>t)exp{f3^X,) 



.1=1 



[I{t -e <u<0) + I{0 <u<t- e)]dt. (36) 



Then, the Cox proportional hazards likelihood may be written as 



nexp(/3^X,)A(I-,|^,( 



U=l 



exp [-K9N^f3^g)] , 



(37) 



where KOj^ i^ q) = jT^gj^j^ g{u)^i{du) = > t) exp(/3^Xi)]A(t|/i, 6*)^^. Assume 

Jj^x^e~^'^'i3,B^^^^ p{dx\u) < 00, for £ = l,...,m and a fixed n > 0. If iT{d(3) and -n^dO) are 
independent priors for /3 and 9, applying the same arguments in proving Theorems 12.11 and 12.31 
yields that the law of ^|D is equivalent to that of a random measure pg^^ ^ g + Z]{j*|s-} QJ\t 

^t^y^^ where fJ-g^ g, with law denoted by F{dfig^ ^ g), is a completely random measure 
with Levy measure e'^'^'f^^'^^'^^^ p{dx\u)rj{du). It is determined by the law of 
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A'Siv./s.e'Q 'Q^'^^'^^'^'/^l^' which is proportional to 

m 

P(d^3^_^J^(d0)7r(d/3)£^(5^^^,|p,r/)nexp(/3^X,) 

i=l 

x|Cs-| n {(Q7P'e-^'-.«^^^"^'^^"pW7|y7)I(^^<y7<0)ry(dy7)} 
0*|s-} 

x|Cs+| n {iQtr^e-3^M)Qrp(dQ^\yl)I{0<y+ <YfUdyl)]. 
{i*|s+} 

Analogous results with presence of covariates of Theorems 12.11 and 12.31 in terms of two S-paths 
can be obtained via Bayes' theorem and multiplication rule. 

Proposition 6.1. Suppose the likelihood of the data is given by (j37p . Assume that /x is a 
completely random measure characterized by the Laplace functional and independently, let 
7r((i/3) and Ti{d9) denote independent priors for /3 and 9. Then, 

(i) the law of /i|0,/3, D can be described by a three-step hierarchical experiment as in Theo- 
remEU of which /jy •) and % g(-) are replaced by fj^^^ g{-, •) and 9jv,Ae(')' respectively. 

(ii) the law of 0|/3,D is characterized by, for any Borel set -B G 

Pv{0 e B\(3,B) = I j;^7r(S-,S+,d0|/3,D), 

JB g_ g+ 

where 7r(S-, S+, (i0|/3, D) oc r/) x |Cs- 1 Tlo-is-} ^m" 

To evaluate any posterior quantities of model (j37p . such as the posterior mean of the under- 
lying bathtub-shaped baseline hazard rate and the posterior mean of the covariate parameters 
(3, run the following Gibbs sampler to obtain random samples from the posterior distribution of 
(Q-,y-,S-,Q+,y+,S+,^,/3) given D: 

1. Draw S^, S^IQ^, y^, Q+, y+, S^, 0, /3, D by independently implementing Algorithm 13.11 
as in steps (Ml) and (M2) in Section [3.1.11 

2. Draw Q^,y^, Q+,y+|S^, S+,^,/3,D according to the analogues of the conditional dis- 
tributions p^HT5]) in Theorem 12.11 with and replaced by and 
5'7V,/3,e(')' respectively. 
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3. Draw ^|Q~, y~, S~, Q+, y+, S+,/3,D from the density proportional to 

{r\s-} {j*\s+} 

4. Draw /3|Q~, y~, S~, Q+, y"*", D from the density proportional to 

m 

^=1 0*|s-} {j*|s+} 

Note that Qj^ f^g{u) is again a piecewise linear function of u as g^g{u) in the case without 
covariates. This does not create any complexities in evaluating integrals at steps 1 and 2 of the 
above Gibbs sampler (see discussion of Remark 5.1 in Ho (2006a)). Step 4 above, which is of 
the same form as the step 4 (for conditional draws of regression parameters (3) of the Blocked 
Gibbs algorithm suggested by Ishwaran and James (2004, page 184), can be dealt with via a 
Metropolis step, while step 3 can also be done similarly as the density looks like the one in 
step 4. 
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